Semi-supervised oblique predictive clustering trees
Similar resources
Semi-supervised Clustering
Clustering is an unsupervised learning problem whose objective is to find a partition of the given data. However, a major challenge in clustering is to define an appropriate objective function in order to find an optimal partition that is useful to the user. To facilitate data clustering, it has been suggested that the user provide some supplementary information about the data (e.g., pairwise ...
Semi-Supervised Projected Clustering
Recent studies suggest that projected clusters with extremely low dimensionality exist in many real datasets. A number of projected clustering algorithms have been proposed in the past several years, but few can identify clusters with dimensionality lower than 10% of the total number of dimensions, which are commonly found in some real datasets such as gene expression profiles. In this paper we...
Semi-supervised clustering methods
Cluster analysis methods seek to partition a data set into homogeneous subgroups. They are useful in a wide variety of applications, including document processing and modern genetics. Conventional clustering methods are unsupervised, meaning that there is no outcome variable nor is anything known about the relationship between the observations in the data set. In many situations, however, informat...
Semi-Supervised Learning with Trees
We describe a nonparametric Bayesian approach to generalizing from few labeled examples, guided by a larger set of unlabeled objects and the assumption of a latent tree-structure to the domain. The tree (or a distribution over trees) may be inferred using the unlabeled data. A prior over concepts generated by a mutation process on the inferred tree(s) allows efficient computation of the optimal...
Semi supervised clustering for Text Clustering
Based on the Affinity Propagation (AP) clustering algorithm, I present in this paper a semi-supervised text clustering algorithm called Seeds Affinity Propagation (SAP). There are two main contributions in my approach: 1) a similarity metric that captures the structural information of texts, and 2) a seed construction method to improve the semi-supervised clustering process. To study the perform...
Journal
Journal title: PeerJ Computer Science
Year: 2021
ISSN: 2376-5992
DOI: 10.7717/peerj-cs.506